Accelerated gradient methods for nonconvex nonlinear and stochastic programming
نویسندگان
چکیده
In this paper, we generalize the well-known Nesterov’s accelerated gradient (AG) method, originally designed for convex smooth optimization, to solve nonconvex and possibly stochastic optimization problems. We demonstrate that by properly specifying the stepsize policy, the AG method exhibits the best known rate of convergence for solving general nonconvex smooth optimization problems by using first-order information, similarly to the gradient descent method. We then consider an important class of composite optimization problems and show that the AG method can solve them uniformly, i.e., by using the same aggressive stepsize policy as in the convex case, even if the problem turns out to be nonconvex. More specifically, the AG method exhibits an optimal rate of convergence if the composite problem is convex, and improves the best known rate of convergence if the problem is nonconvex. Based on the AG method, we also present new nonconvex stochastic approximation methods and show that they can improve a few existing rates of convergence for nonconvex stochastic optimization. To the best of our knowledge, this is the first time that the convergence of the AG method has been established for solving nonconvex nonlinear programming in the literature.
منابع مشابه
Generalized Uniformly Optimal Methods for Nonlinear Programming
In this paper, we present a generic framework to extend existing uniformly optimal convex programming algorithms to solve more general nonlinear, possibly nonconvex, optimization problems. The basic idea is to incorporate a local search step (gradient descent or Quasi-Newton iteration) into these uniformly optimal convex programming methods, and then enforce a monotone decreasing property of th...
متن کاملStochastic Compositional Gradient Descent: Algorithms for Minimizing Nonlinear Functions of Expected Values
Classical stochastic gradient methods are well suited for minimizing expected-valued objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values, i.e., problems of the form minx f ( Ew[gw(x)] ) . In this paper, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of q...
متن کاملStochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method pos...
متن کاملConvergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization
In many modern machine learning applications, structures of underlying mathematical models often yield nonconvex optimization problems. Due to the intractability of nonconvexity, there is a rising need to develop efficient methods for solving general nonconvex problems with certain performance guarantee. In this work, we investigate the accelerated proximal gradient method for nonconvex program...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Math. Program.
دوره 156 شماره
صفحات -
تاریخ انتشار 2016